ABSTRACT

Based on the fast parallel matrix multiplication scheme of Krishnamurthy and Klette, $O(\log m)$ step algorithms using $m$ matrix processors are described for the exact determination of the Moore-Penrose generalized inverse and the rank of an $(m \times m)$ matrix with integer entries.
1.  Introduction

In a recent paper, Krishnamurthy and Klette [1] have shown that the exact product of two $(m \times m)$ matrices having integer elements with $e$-bit precision can be obtained in the MIMD mode with complexity

$$\mu = O\left(\log r \, \log e + (\log em)\left(\max_i \log e_i\right)\right)$$

using prime moduli arithmetic with $r$ primes, each of precision $e_i$ bits. (All logarithms are taken to base 2.)

Based on this parallel scheme for multiplication, we describe here a parallel method with $m$ such matrix processors to determine exactly, in $O(\log m)$ steps, the rank and the generalized inverse of a rectangular $(m \times n)$ matrix with integral entries.

This can be extended to matrices with complex number entries. However, this procedure is in general invalid for the determination of the rank of a matrix over a finite field.

Since the rank of a matrix is very sensitive to errors in computation (especially with matrices which are ill-conditioned), throughout this paper our discussion will be confined to exact computational procedures using residue arithmetic. For the principles and practice of these procedures readers are referred to the papers by Krishnamurthy, Rao and Subramanian [2] and Krishnamurthy [3].
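The idea of exact multiplication by prime moduli arithmetic can be illustrated with a minimal sequential sketch. It is not the MIMD scheme of [1]; it only shows that the product can be formed independently modulo each prime and then recombined by the Chinese remainder theorem. The function names and the choice of primes are assumptions made for this illustration, and the result is exact only when every entry of the product lies in $(-P/2, P/2]$, where $P$ is the product of the primes.

```python
from math import prod

def matmul_mod(A, B, p):
    """Product of integer matrices A and B with every entry reduced modulo p."""
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*B)]
            for row in A]

def crt_symmetric(residues, primes):
    """Combine residues into the unique integer in (-P/2, P/2], P = product of primes."""
    P = prod(primes)
    x = 0
    for r, p in zip(residues, primes):
        q = P // p
        x += r * q * pow(q, -1, p)  # pow(q, -1, p) is the inverse of q modulo p
    x %= P
    return x - P if x > P // 2 else x

def exact_matmul(A, B, primes):
    """Exact product A*B, assuming every entry of A*B lies in (-P/2, P/2]."""
    # One residue image per prime; these are mutually independent computations.
    images = [matmul_mod(A, B, p) for p in primes]
    rows, cols = len(A), len(B[0])
    return [[crt_symmetric([img[i][j] for img in images], primes)
             for j in range(cols)] for i in range(rows)]

A = [[3, -7], [12, 5]]
B = [[-4, 9], [6, -11]]
C = exact_matmul(A, B, [101, 103, 107])
print(C)
```

Because each residue image depends on one prime only, the $r$ images could be computed on separate processors, which is the property the parallel scheme exploits.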


2.  Principle 


The computation of the g-inverse and rank of a rectangular matrix is based on the following theorem [2,4]:

Theorem. Let $A$ be any $m \times n$ matrix with real entries. Let

$$B(\lambda) = \lambda^m + a_1 \lambda^{m-1} + \cdots + a_m$$

be the characteristic polynomial of $B = AA^t$ ($A^t$ is the transpose of $A$). If $k \ne 0$ is the largest integer such that $a_k \ne 0$, then the Moore-Penrose inverse of $A$ is

$$A^+ = -a_k^{-1} A^t \left[(AA^t)^{k-1} + a_1 (AA^t)^{k-2} + \cdots + a_{k-1} I\right] \qquad (1)$$

If $k = 0$ is the largest integer such that $a_k \ne 0$ (taking $a_0 = 1$), then $A^+ = 0$.

Also, the rank of $A$ is $k$.

Based on this theorem an algorithm is described in [2] for computing the exact generalized inverse of $A$ and its rank. In this paper we describe a parallel version of this algorithm using the processor model described in [1].

Computation of the rank and g-inverse proceeds through the following steps:

1.  Computing $B = AA^t$.

2.  Finding the characteristic equation of $B$ and computing $A^+$ using (1). This can be done by computing the coefficients $a_i$ of $B(\lambda)$ using Leverrier's method [5], as shown below:

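Let $\lambda_1, \lambda_2, \ldots, \lambda_m$ be the characteristic roots of $B$ and let

$$S_k = \sum_{i=1}^{m} \lambda_i^k, \qquad 1 \le k \le m;$$

then $S_k = \mathrm{trace}(B^k)$ for $1 \le k \le m$, and

$$\begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ S_1 & 2 & 0 & \cdots & 0 \\ S_2 & S_1 & 3 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ S_{m-1} & S_{m-2} & S_{m-3} & \cdots & m \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ \vdots \\ a_m \end{pmatrix} = -\begin{pmatrix} S_1 \\ S_2 \\ S_3 \\ \vdots \\ S_m \end{pmatrix} \qquad (2)$$

which in matrix form can be expressed as

$$M c = S,$$

where $c = (a_1, \ldots, a_m)^t$ and $S = -(S_1, \ldots, S_m)^t$.

Equation (2) can be proved by using the well known Newton's identities [5].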
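As a small worked illustration of Newton's identities, the triangular system can be solved row by row: row $k$ reads $S_k + a_1 S_{k-1} + \cdots + a_{k-1} S_1 + k a_k = 0$. The sketch below assumes exact rational arithmetic via Python's `fractions` module; the function name is hypothetical, and forward substitution is used here for brevity rather than the matrix inversion discussed in the text.

```python
from fractions import Fraction

def coefficients_from_power_sums(S):
    """Solve the triangular system by forward substitution.

    S[k-1] holds the power sum S_k; the result is [a_1, ..., a_m].
    """
    a = []
    for k in range(1, len(S) + 1):
        # Row k: S_k + a_1 S_{k-1} + ... + a_{k-1} S_1 + k a_k = 0
        acc = S[k - 1] + sum(a[j - 1] * S[k - j - 1] for j in range(1, k))
        a.append(Fraction(-acc, k))
    return a

# Characteristic roots 1, 2, 3 give the polynomial x^3 - 6x^2 + 11x - 6.
roots = [1, 2, 3]
S = [sum(r ** k for r in roots) for k in range(1, 4)]  # [6, 14, 36]
print(coefficients_from_power_sums(S))
```

The division by $k$ in each row is the reason the method needs a field in which $1, 2, \ldots, m$ are all invertible.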

The calculation of $a_i$ from (2) and determination of $A^+$ proceeds as follows:

i)  Computation of $S_k$, $1 \le k \le m$:

This requires computation of the powers of $B$, viz., $B^2, \ldots, B^k, \ldots, B^m$, and the computation of the trace of each $B^k$ $(1 \le k \le m)$.

ii)  Computation of $M^{-1}$:

The determination of $a_i$ and hence the rank $k$ requires the computation of the inverse of the non-singular triangular matrix $M$.

iii)  Determination of $M^{-1}S$

iv)  Determination of $A^+$ using (1).
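The steps above can be sketched sequentially as follows. This is only an illustration under stated assumptions: it uses exact rational arithmetic from Python's `fractions` module instead of residue arithmetic, it solves system (2) by forward substitution instead of forming $M^{-1}$ explicitly, and it makes no attempt at parallelism. All function names are hypothetical.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def ginverse_and_rank(A):
    """Return (A+, rank of A) for an integer matrix A, following formula (1)."""
    m, n = len(A), len(A[0])
    At = transpose(A)
    B = matmul(A, At)
    # Step i): powers B^0, ..., B^m and the traces S_k = trace(B^k).
    powers = [identity(m)]
    for _ in range(m):
        powers.append(matmul(powers[-1], B))
    S = [sum(P[i][i] for i in range(m)) for P in powers[1:]]
    # Steps ii)-iii): coefficients a_1..a_m from system (2); a[0] = 1 is the
    # leading coefficient of the characteristic polynomial.
    a = [Fraction(1)]
    for k in range(1, m + 1):
        acc = sum(a[j] * S[k - j - 1] for j in range(k))
        a.append(-acc / k)
    # The rank is the largest index k with a_k != 0.
    k = max(i for i in range(m + 1) if a[i] != 0)
    if k == 0:
        return [[Fraction(0)] * m for _ in range(n)], 0
    # Step iv): accumulate B^(k-1) + a_1 B^(k-2) + ... + a_(k-1) I.
    acc = [[sum(a[j] * powers[k - 1 - j][r][c] for j in range(k))
            for c in range(m)] for r in range(m)]
    G = matmul(At, acc)
    return [[-g / a[k] for g in row] for row in G], k

A = [[1, 0], [0, 2], [0, 0]]
Aplus, rank = ginverse_and_rank(A)
print(rank, Aplus)
```

For this $A$ the matrix $B = AA^t$ is diagonal with entries 1, 4, 0, so $a_3 = 0$ and the rank is 2; the result can be checked against the Penrose condition $A A^+ A = A$.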




3.  Complexity


We now proceed to compute the complexity. For this purpose we choose as the unit of measurement the matrix multiplication time ($\mu$) and matrix addition time ($\alpha$), each with complexity as defined in [1]. We also assume that by a processor we mean a single matrix processor which can multiply two matrices in time $\mu$ or add in time $\alpha$. The total complexity can therefore be computed in terms of the basic operations using these and the definitions in [1].

The computation of powers $B^k$ requires $\log m$ parallel steps with $m/2$ matrix processors each performing a matrix multiplication in time $\mu$. The computation of the trace requires $\log m$ steps of addition time for $m$ numbers in each $B^k$; this can be done with $m$ matrix processors in parallel.

For example, if $m = 16$, with 8 processors the number of required steps is $\log_2 16 = 4$, and at each step the calculations are organized as follows:

[Table with columns "Time" and "Processors"; entries not recovered.]

The computation of the inverse of the non-singular triangular matrix $M$ is carried out using a formula similar to (1). Since $M$ is a triangular matrix, its eigenvalues are simply the diagonal elements $1, 2, \ldots, m$; therefore, the coefficients of the characteristic polynomial of $M$, namely

$$M(\lambda) = \lambda^m + b_1 \lambda^{m-1} + b_2 \lambda^{m-2} + \cdots + b_m = 0 \qquad (3)$$

can be precomputed once and for all and stored. Using (3) and the Cayley-Hamilton theorem, we can write

$$M^{-1} = -\frac{1}{b_m}\left[M^{m-1} + b_1 M^{m-2} + \cdots + b_{m-1} I\right]$$

This can be computed in $2 + \log m$ steps with $m$ matrix processors each performing matrix multiplication in time $\mu$, and $\log m$ steps of addition in time $\alpha$. We then need to compute $M^{-1}S$.
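The Cayley-Hamilton inversion of a triangular matrix with diagonal $1, 2, \ldots, m$ can be checked on a small case. The sketch below is sequential and accumulates the polynomial in $M$ by Horner's rule, which is an assumption made for brevity and not the $2 + \log m$ step parallel evaluation; the helper names are hypothetical.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def poly_from_roots(roots):
    """Coefficients [1, b_1, ..., b_m] of the monic polynomial with the given roots."""
    coeffs = [1]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        coeffs = [c - r * p for c, p in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

def inverse_by_cayley_hamilton(M):
    """Inverse of a non-singular triangular integer matrix M."""
    m = len(M)
    # For a triangular matrix the eigenvalues are the diagonal entries.
    b = poly_from_roots([M[i][i] for i in range(m)])
    # Horner accumulation of M^(m-1) + b_1 M^(m-2) + ... + b_(m-1) I.
    acc = identity(m)
    for i in range(1, m):
        acc = matmul(acc, M)
        for d in range(m):
            acc[d][d] += b[i]
    return [[Fraction(-x, b[m]) for x in row] for row in acc]

# M built from S_1 = 5, S_2 = 17 for m = 3.
M = [[1, 0, 0], [5, 2, 0], [17, 5, 3]]
Minv = inverse_by_cayley_hamilton(M)
print(Minv)
```

Since the diagonal of $M$ does not depend on $A$, the coefficients $b_i$ depend only on $m$, which is why they can be precomputed and stored.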

Thus the rank can be computed in

$$2(\mu + \alpha)\log m + 3\mu = O(\log m)$$

matrix multiplication steps.

The determination of $A^+$ can be carried out from the stored values of $B^k$ and accumulation of these multiplied by the coefficients as in (1); this is then multiplied by $A^t$ and divided by $a_k$. This requires $\log m$ matrix addition steps and 3 matrix multiplication steps, using $m$ matrix processors.

Thus the computation of $A^+$ takes $(2\mu + 3\alpha)\log m + 6\mu$ time complexity using $m$ matrix processors.


Concluding  remarks 

(i)  The above algorithm fails for a matrix over a finite field, since the rank of $AA^t$ is not in general equal to the rank of $A$. Also, if we want to use a characteristic equation method by directly computing the polynomial of $A$, we cannot, in general, say that the rank of the matrix is equal to the number of non-zero characteristic roots. Further, even in the special case where rank $AA^t$ equals rank $A$, equation (2) is not solvable over the finite field. For instance, over GF(2) the diagonal elements of $M$ are alternately 1 and 0 and hence the coefficients (except $a_1$) cannot be determined. Even for this special case, over GF($p$), this procedure can determine at most the rank of a $(p-1) \times (p-1)$ matrix. Hence this algorithm cannot be used for matrices over a finite field which occur in graph theory, coding theory and other areas of computer science.
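A one-row example makes the first point concrete: over GF(2) the matrix $A = (1 \; 1)$ has rank 1, while $AA^t = (1 + 1) = (0)$ has rank 0. The sketch below checks this with a small Gaussian elimination modulo a prime; the function name is hypothetical and the routine is included only to verify the example.

```python
def rank_mod_p(rows, p):
    """Rank of an integer matrix over GF(p), p prime, by Gaussian elimination."""
    M = [[x % p for x in row] for row in rows]
    rank = 0
    for col in range(len(M[0])):
        # Find a row at or below position `rank` with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(M)) if M[r][col]), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        inv = pow(M[rank][col], -1, p)  # modular inverse of the pivot
        M[rank] = [(x * inv) % p for x in M[rank]]
        for r in range(len(M)):
            if r != rank and M[r][col]:
                f = M[r][col]
                M[r] = [(x - f * y) % p for x, y in zip(M[r], M[rank])]
        rank += 1
    return rank

A = [[1, 1]]
# Entry (i, j) of A * A^t is the dot product of rows i and j of A.
AAt = [[sum(x * y for x, y in zip(r, c)) for c in A] for r in A]
print(rank_mod_p(A, 2), rank_mod_p(AAt, 2))
```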


(ii)  The algorithm can be used to compute the inverses of polynomial matrices [3].